

ON  THE  THEORY  OF  SELECTION 


1.  Introduction. 

The  present  paper  is  the  third  of  a  series  of  papers 
on  the  classification  problem.  The  first  paper  is  concerned 
with  the  classification  of  individuals.  The  second  paper,  not 
yet  completed,  will  deal  with  the  classification  of  populations. 

In this third paper we shall consider a special class
of classification problems with no standards given. Again each
of $s$ populations $\pi_1, \dots, \pi_s$ is to be classified into
one of two categories ("good" and "bad"). However, what constitutes
a good or bad population is not defined absolutely but
in terms of the quality of the populations at hand.

Problems of this kind arise frequently and have recently
been treated by a number of authors (Mosteller [1],
Paulson [2], Stein [3], Bahadur [4]). We assume that we have
a sample $X_{ij}$ ($j = 1, \dots, n_i$) from each of the populations
$\pi_i$ and that the distribution of the $X_{ij}$ depends on an
unknown parameter $\theta_i$. The populations are ranked according to
the values of the $\theta$'s. (For example, if the $\theta$'s are real
valued, that one of two populations may be the better that has
the higher $\theta$-value.)

This  work  was  done  at  Columbia  University  and  supported 
in  part  by  the  USAF  School  of  Aviation  Medicine.  The  first 
paper  of  this  series,  referred  to  above,  is  Report  No.  6  of 
Project No. 21-49-004 ("On the Simultaneous Classification
of Several Individuals"), USAF School of Aviation Medicine.
As has been pointed out by Bahadur, it is frequently
desirable to go beyond classifying the populations as good or
bad. If, for example, one is looking for good varieties of
wheat to plant, one must decide in what proportions to plant
them. However, this more detailed analysis is not always required.
When it does apply, the solution tends to be that procedure
which is appropriate when, rather than selecting a number
of good populations, one is interested only in the best of them.

We  shall  here  follow  essentially  the  formulation  of 
Paulson  who  considered  this  problem  for  normal  populations. 

To be specific, let us assume that the $\theta$'s are real valued
and that quality improves with increasing $\theta$. Let us assume
further that there is given a function $g(\theta, \theta')$ increasing
in the second variable and decreasing in the first, such that
the population $\pi_i$ is considered good provided

$$g(\theta_i, \max_j \theta_j) < \Delta$$

where $\Delta$ is some fixed positive number. If the variables are
normally distributed with mean $\theta$ and variance $\sigma^2$, we may
for example take $g(\theta, \theta') = \theta' - \theta$ or
$g(\theta, \theta') =$ [fraction illegible in the source].
In the Poisson case we might take $g(\theta, \theta') = \theta'/\theta$ and in
the binomial case

$$g(\theta, \theta') = \text{[expression garbled in the source]}$$

where in both cases $\theta$ indicates the mean of the variables.
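The definition of a good population can be made concrete with a small sketch. The function names below (`good_populations`, `g_difference`, `g_ratio`) are hypothetical and not part of the paper; the two choices of $g$ are the difference $\theta' - \theta$ from the normal case and the ratio $\theta'/\theta$ from the Poisson case. Note that with the ratio form the best population always has $g = 1$, so the sketch assumes $\Delta > 1$ in that case.

```python
def good_populations(thetas, g, delta):
    """Indices i for which g(theta_i, max_j theta_j) < delta, i.e. the
    populations that count as "good" relative to the best one at hand."""
    best = max(thetas)
    return [i for i, theta in enumerate(thetas) if g(theta, best) < delta]


def g_difference(theta, theta_prime):
    # Normal case: distance below the best mean.
    return theta_prime - theta


def g_ratio(theta, theta_prime):
    # Poisson case: ratio of the best mean to this mean (theta > 0 assumed).
    return theta_prime / theta


print(good_populations([0.0, 0.4, 1.0, 0.9], g_difference, 0.5))  # [2, 3]
print(good_populations([2.0, 3.0, 4.0], g_ratio, 1.5))            # [1, 2]
```

In both examples the population with the largest parameter is always among the good ones, which is the property used later in the discussion of condition (1.1).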

As before we shall adopt the point of view of the
Neyman-Pearson theory and ask that

(1.1)  the expected number of bad populations classified
       as good be $\le \gamma$.

Subject to (1.1) we wish to maximize the expected number of
good populations classified correctly.

Condition  (1.1)  has  one  consequence  that  may  seem 
undesirable.  It  follows  from  the  definition  just  given,  that 
there  is  always  at  least  one  good  population,  that  with  the 
maximum  ©.  However,  if  we  impose  (1.1)  we  may  sometimes  have 
to  classify  all  of  the  populations  as  bad.  This  will  occur, 
roughly  speaking,  when  the  observations  indicate  a  situation 
in  which  the  sample  size  is  too  small  to  make  the  selection 
of  the  good  populations  with  the  desired  degree  of  accuracy. 

There are two ways of avoiding this difficulty. If one knows
the order of magnitude of the parameters involved, one can determine
a sample size which makes it very probable that one
will be able to perform the classification. Alternatively,
of course, the situation points to the use of sequential procedures.
Such procedures also have the advantage for problems
of the kind considered here that they permit classifying the
populations gradually. A decision will be reached early on
those that are either very good or very poor, while for the
intermediate ones one may take a larger number of observations.

A  procedure  of  this  kind  was  discussed  by  Stein  [3]. 

Although  it  is  easy  to  develop  a  general  theory  of 
minimax  procedures  for  the  classification  problem  described 
here,  the  application  of  such  a  theory  to  particular  cases  runs 
into  difficulties  which  the  author  so  far  has  not  been  able  to 
surmount.  We  shall  illustrate  the  situation  with  an  example. 

2.  An example.

Perhaps the simplest example, and one which was considered
by Paulson, assumes that the $X_{ij}$, $i = 1, \dots, s$; $j = 1, \dots, n$,
are samples from normal distributions with means $\theta_i$ and common
variance $\sigma^2 = 1$. Since the $\bar{X}_i$ form a set of sufficient
statistics we assume without loss of generality that
$n = 1$ and denote our variables by $X_1, \dots, X_s$. We take
$g(\theta, \theta') = \theta' - \theta$, so that $\pi_i$ is considered good if

$$\theta_i > \max_j \theta_j - \Delta.$$

In order to obtain the minimax procedure we must
guess two least favorable distributions: one that maximizes
on the average the number of good populations that are
classified as bad, and one that maximizes the number of bad
populations that are classified as good. One conjectures that
both are concentrated on the set

    $s - 1$ of the $\theta$'s are equal, say $= \theta$;
    the remaining mean $= \theta + \Delta$.

It turns out that it is immaterial what value of $\theta$ we take.
Let us put $\theta = 0$, and let us assume that the a priori distribution
assigns probability $1/s$ to each of the possible parameter
sets $(\Delta, 0, \dots, 0)$, $(0, \Delta, 0, \dots, 0)$, $\dots$, $(0, \dots, 0, \Delta)$.

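Then the Bayes problem becomes

Maximize

(1.2)  $\frac{1}{s} \sum_{i=1}^{s} E[\phi_1(X) + \cdots + \phi_s(X) \mid \theta_i = \Delta,\ \theta_j = 0 \text{ if } j \ne i]$

subject to the condition

(1.3)  $\frac{1}{s} \sum_{i=1}^{s} E[\phi_1(X) + \cdots + \phi_s(X) - \phi_i(X) \mid \theta_i = \Delta,\ \theta_j = 0 \text{ if } j \ne i] \le \gamma.$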


This problem is solved by means of a lemma to be
proved in the paper on the classification of populations. The
Bayes solution sets $\phi_1(x) = 1$ if

$$\frac{e^{-\frac{1}{2}(x_1 - \Delta)^2 - \frac{1}{2}(x_2^2 + \cdots + x_s^2)} + e^{-\frac{1}{2}(x_2 - \Delta)^2 - \frac{1}{2}(x_1^2 + x_3^2 + \cdots + x_s^2)} + \cdots}{e^{-\frac{1}{2}(x_2 - \Delta)^2 - \frac{1}{2}(x_1^2 + x_3^2 + \cdots + x_s^2)} + e^{-\frac{1}{2}(x_3 - \Delta)^2 - \frac{1}{2}(x_1^2 + x_2^2 + x_4^2 + \cdots + x_s^2)} + \cdots} > k,$$

that is, if

$$e^{\Delta x_1} + e^{\Delta x_2} + \cdots + e^{\Delta x_s} > k' \left[ e^{\Delta x_2} + \cdots + e^{\Delta x_s} \right],$$

and hence if

$$e^{\Delta x_2} + \cdots + e^{\Delta x_s} < C e^{\Delta x_1}.$$

The solution for the remaining $\phi$'s is obtained by symmetry.
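The rule just derived is easy to state in code. The following is a minimal sketch, not the author's implementation: the names `classify` and `log_sum_exp` are hypothetical, `delta` stands for the constant in the definition of a good population, and `t` stands for the logarithm of the constant $C$. The comparison is carried out on the log scale so that large exponents do not overflow.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def classify(x, delta, t):
    """Return a list of 0/1 decisions, one per population (s >= 2 assumed).

    Population i is declared good (1) when
        sum_{j != i} exp(delta * x_j) < exp(t) * exp(delta * x_i),
    which is the displayed rule with C = exp(t), applied to every i
    by symmetry.
    """
    decisions = []
    for i, xi in enumerate(x):
        others = [delta * xj for j, xj in enumerate(x) if j != i]
        decisions.append(1 if log_sum_exp(others) < t + delta * xi else 0)
    return decisions


print(classify([0.1, -0.3, 2.5], delta=2.0, t=1.0))  # [0, 0, 1]
print(classify([0.0, 0.0, 0.0], delta=2.0, t=0.0))   # [0, 0, 0]
```

The second call illustrates the consequence noted in the introduction: when the observations do not separate the populations, the rule may classify all of them as bad.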

To complete the solution we must now show that for
this procedure

(a) the expected number of good populations classified as
    bad takes on its maximum when $\theta_s = \Delta$, $\theta_1 = \cdots = \theta_{s-1} = 0$,

(b) the expected number of poor populations classified as good
    takes on its maximum when $\theta_s = \Delta$, $\theta_1 = \cdots = \theta_{s-1} = 0$.

We need to prove (b) not only to show that the procedure
is minimax but also to show that the correct determination
of $C$ is

(1.4)  $P(e^{\Delta X_2} + \cdots + e^{\Delta X_s} < C e^{\Delta X_1} \mid \theta_1 = \cdots = \theta_{s-1} = \theta_s - \Delta = 0) = \alpha.$
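Equation (1.4) fixes $C$ only implicitly. One way to get a feel for it is to approximate $t = \log C$ by simulation. The sketch below is an illustration under stated assumptions rather than a method from the paper: it draws independent unit-variance normal variables with $\theta_1 = \cdots = \theta_{s-1} = 0$ and $\theta_s = \Delta$, writes `alpha` for the probability on the right-hand side of (1.4), and takes the empirical quantile of the statistic $\log \sum_{j \ge 2} e^{\Delta X_j} - \Delta X_1$. All function names are hypothetical.

```python
import math
import random


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def draw_statistic(rng, s, delta):
    """One draw of log(sum_{j>=2} exp(delta*X_j)) - delta*X_1 under
    theta_1 = ... = theta_{s-1} = 0, theta_s = delta, unit variance."""
    x = [rng.gauss(0.0, 1.0) for _ in range(s - 1)] + [rng.gauss(delta, 1.0)]
    return log_sum_exp([delta * xj for xj in x[1:]]) - delta * x[0]


def calibrate_t(s, delta, alpha, n_sim=20000, seed=0):
    """Empirical alpha-quantile of the statistic; population 1 is
    declared good exactly when the statistic is below t = log C."""
    rng = random.Random(seed)
    draws = sorted(draw_statistic(rng, s, delta) for _ in range(n_sim))
    return draws[int(alpha * n_sim)]


def estimated_size(t, s, delta, n_sim=20000, seed=1):
    """Fresh Monte Carlo estimate of the left-hand side of (1.4)."""
    rng = random.Random(seed)
    hits = sum(draw_statistic(rng, s, delta) < t for _ in range(n_sim))
    return hits / n_sim


t = calibrate_t(s=4, delta=3.0, alpha=0.05)
print("t = log C:", t)
print("estimated size:", estimated_size(t, s=4, delta=3.0))
```

The second estimate uses an independent seed, so it serves as a rough check that the calibrated constant gives a probability close to `alpha` at the conjectured least favorable configuration. It says nothing about other parameter values, which is exactly the open question in (b).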


The  difficulty  referred  to  in  the  beginning  of  the 
last  section  consists  in  the  proof  of  (a)  and  (b).  We  shall 
now  present  certain  partial  results  on  this  problem. 

Let us first consider what happens for large $\Delta$.
Let us set $C = e^t$ and determine it by means of (1.4). If we
put $Y_i = X_i - \theta_i$, (1.4) becomes

$$P(e^{\Delta Y_2} + \cdots + e^{\Delta Y_{s-1}} + e^{\Delta (Y_s + \Delta)} < e^{\Delta Y_1 + t}) = \alpha.$$

Dividing both sides by $e^{\Delta^2}$, we see that for large $\Delta$ this
becomes essentially

$$P(e^{\Delta Y_s} < e^{\Delta Y_1 + t - \Delta^2}) = \alpha.$$

Therefore $\frac{t - \Delta^2}{\Delta} \to -k$ $(k > 0)$ as $\Delta \to \infty$.
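The limiting constant $k$ can be written down explicitly under one extra assumption that the text does not spell out, namely that $Y_1$ and $Y_s$ are independent standard normal variables. Then $Y_s - Y_1$ is normal with variance 2, and the limiting equation $P(Y_s - Y_1 < -k) = \alpha$ gives $k = -\sqrt{2}\,\Phi^{-1}(\alpha)$, which is positive when $\alpha < 1/2$. The sketch below (hypothetical function names) evaluates this and the resulting first-order approximation $t \approx \Delta^2 - k\Delta$.

```python
import math
from statistics import NormalDist


def limiting_k(alpha):
    """k solving P(Y_s - Y_1 < -k) = alpha for independent N(0, 1)
    variables Y_1, Y_s (an assumption of this sketch)."""
    return -math.sqrt(2.0) * NormalDist().inv_cdf(alpha)


def approx_t(delta, alpha):
    """First-order approximation t = delta^2 - k*delta; the o(delta)
    remainder is ignored."""
    return delta ** 2 - limiting_k(alpha) * delta


print(limiting_k(0.05))      # about 2.326
print(approx_t(3.0, 0.05))   # about 2.02
```

For moderate $\Delta$ the neglected remainder need not be small, so the approximation is a guide to the order of magnitude of $t$ rather than a substitute for solving (1.4).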

We shall first consider problem (a). As $\Delta \to \infty$,
the expected number of good populations classified as good
when $\theta_1 = \cdots = \theta_{s-1} = 0$, $\theta_s = \Delta$ tends to 1.

It is clear that if $s-1$ of the populations are bad,
and only one good, the probability of classifying the good one
correctly is minimized when $\theta_1 = \cdots = \theta_{s-1} = 0$; i.e. when we are
in the situation which we believe to be the minimizing one.

We need to compare this situation with those in which
there is more than one good population. Suppose

$$\theta_s = \Delta, \quad \theta_1 \ge 0^+, \quad \theta_2, \dots, \theta_{s-1} \le \Delta.$$

Let

$$P_1 = P(e^{\Delta (Y_2 + \theta_2)} + \cdots + e^{\Delta (Y_s + \theta_s)} < e^{\Delta (Y_1 + \theta_1) + t})$$

and let $P_2, \dots, P_s$ be defined analogously. The expectation
we are concerned with is the sum of those $P$'s corresponding
to good populations and hence is $\ge P_1 + P_s$.
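We shall now show that uniformly for
$\theta_1 \ge 0^+$, $\theta_2, \dots, \theta_{s-1} \le \Delta$ we have

$$P_s \to 1, \qquad P_1 \ge M_1(\Delta),$$

where $\lim_{\Delta \to \infty} M_1(\Delta) > 0$. In fact

$$P_s = P(e^{\Delta (Y_1 + \theta_1)} + \cdots + e^{\Delta (Y_{s-1} + \theta_{s-1})} < e^{\Delta (Y_s + \theta_s) + t})$$

$$\ge P(e^{\Delta Y_1} + \cdots + e^{\Delta Y_{s-1}} < e^{\Delta Y_s + t}) \to 1,$$

since $t = \Delta^2 - k\Delta + o(\Delta) \to \infty$.

On the other hand

$$P_1 \ge P(e^{\Delta (Y_2 + \Delta)} + \cdots + e^{\Delta (Y_s + \Delta)} < e^{\Delta Y_1 + \Delta^2 - k\Delta + o(\Delta)})$$

$$= P(e^{\Delta Y_2} + \cdots + e^{\Delta Y_s} < e^{\Delta Y_1 - k\Delta + o(\Delta)}).$$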


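[An intermediate display involving the average
$\frac{e^{\Delta Y_2} + \cdots + e^{\Delta Y_s}}{s-1}$ is illegible in the source.]

$$P_1 \ge P\left[ (s-1)\, e^{\Delta \frac{Y_2 + \cdots + Y_s}{s-1}} < e^{\Delta Y_1 - k\Delta + o(\Delta)} \right]$$

$$= P\left[ \frac{Y_2 + \cdots + Y_s}{s-1} + \frac{\log (s-1)}{\Delta} < Y_1 - k + \frac{o(\Delta)}{\Delta} \right]$$

$$\to P\left( \frac{Y_2 + \cdots + Y_s}{s-1} < Y_1 - k \right) > 0.$$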

Unfortunately we have not been able to show as much
for the even more important problem (b), and in fact it seems
doubtful that the result (b) holds exactly, even for large $\Delta$.
What we can show here is that the values $\hat\theta_1, \dots, \hat\theta_{s-1}$ that
maximize the expected number of bad populations classified as
good are such that for $i = 1, \dots, s-1$

$$\hat\theta_i \to 0^- \quad \text{as } \Delta \to \infty.$$

The proof of this is in fact quite simple. It is obvious that
$\hat\theta_1, \dots, \hat\theta_{s-1}$ are $\le 0$. The expectation we are concerned
with is thus $P_1 + \cdots + P_{s-1}$.

Consider now a sequence of values $\theta_1(\Delta)$ tending
to a limiting value $\theta_1$. Then it is seen that regardless of
the values of $\theta_2, \dots, \theta_{s-1} \le 0$ (but not necessarily uniformly
in these variables)

$$\lim_{\Delta \to \infty} P_1 = P(Y_s < Y_1 + \theta_1 - k).$$

Thus in the limit $P_1$ is maximized for $\theta_1 = 0$ regardless
of the other $\theta$'s, and similarly for $\theta_2, \dots, \theta_{s-1}$. Hence
the result follows.

This proves that at least asymptotically the procedure
has the correct size.
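The limiting probability in the argument above can be evaluated numerically as a sanity check. The sketch below assumes, as an addition to the text, that $Y_1$ and $Y_s$ are independent standard normal, so that $P(Y_s < Y_1 + \theta_1 - k) = \Phi((\theta_1 - k)/\sqrt{2})$, and it takes $k = -\sqrt{2}\,\Phi^{-1}(\alpha)$ with `alpha` denoting the probability on the right-hand side of (1.4). The function name `limiting_p1` is hypothetical.

```python
import math
from statistics import NormalDist


def limiting_p1(theta1, k):
    """P(Y_s < Y_1 + theta1 - k) for independent N(0, 1) variables
    Y_1 and Y_s (an assumption of this sketch)."""
    return NormalDist().cdf((theta1 - k) / math.sqrt(2.0))


alpha = 0.05
k = -math.sqrt(2.0) * NormalDist().inv_cdf(alpha)

# The limit increases with theta1, so over theta1 <= 0 it is largest at 0,
# where it equals alpha up to rounding.
for theta1 in (-2.0, -1.0, -0.5, 0.0):
    print(theta1, round(limiting_p1(theta1, k), 4))
```

Under these assumptions the value at $\theta_1 = 0$ reproduces `alpha`, which is the sense in which the procedure has the correct size asymptotically.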